AITopics

2605.21692

Country:

North America (0.46)
Europe > United Kingdom (0.28)
South America > Brazil > Minas Gerais (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Tang, Yifan, Wang, Qiquan, García-Redondo, Inés, Monod, Anthea

Topological Signatures of Grokking

arXiv.org Machine LearningMay-8-2026

We study the grokking phenomenon through the lens of topology. Using persistent homology on point clouds derived from the embedding matrices of a range of models trained on modular arithmetic with varying primes, we identify a clear and consistent topological signature of grokking: a sharp increase in both the maximum and total persistence of first homology ($H_1$). Persistence diagrams reveal the emergence of a dominant long-lived topological feature together with increasingly structured secondary features, reflecting the underlying cyclic structure of the task. Compared to existing spectral and geometric diagnostics -- specifically, Fourier analysis and local intrinsic dimension -- persistent homology provides a unified geometric and topological characterization of representation learning, capturing both local and global multi-scale structure. Ablations across data regimes and control settings show that these topological transitions are tied to generalization rather than memorization. Our results suggest that persistent homology offers a principled and interpretable framework for analyzing how neural networks internalize latent structure during training.

artificial intelligence, machine learning, persistence, (16 more...)

2605.06352

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsApr-30-2026, 05:58:24 GMT

Stochastic_Preconditioners-7

J Sun

artificial intelligence, machine learning, matrix, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Information Processing SystemsApr-29-2026, 22:05:55 GMT

An Efficient Dataset Condensation Plugin and Its Application to Continual Learning

Dataset condensation (DC) distills a large real-world dataset into a small synthetic dataset, with the goal of training a network from scratch on the latter that performs similarly to the former. State-of-the-art (SOTA) DC methods have achieved satisfactory results through techniques such as accuracy, gradient, training trajectory, or distribution matching. However, these works all perform matching in the high-dimension pixel space, ignoring that natural images are usually locally connected and have lower intrinsic dimensions, resulting in low condensation efficiency. In this work, we propose a simple-yet-efficient dataset condensation plugin that matches the raw and synthetic datasets in a low-dimensional manifold.

artificial intelligence, machine learning, natural language, (19 more...)

Country: North America > United States (0.28)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Neural Information Processing SystemsApr-25-2026, 11:02:44 GMT

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks Supplementary Material

This document supplements our main paper entitled Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks as follows: (i) Sec. S1 firsts gives some of the formal definitions and interpretations omitted from the main paper due to space limitations. Next, it involves a discussion and contrasts our dimension estimator against the commonly used ones. Finally, it provides additional details into the regularizer we devised in the main paper; (ii) we then provide the complement the experimental evaluations given in the main paper and present additional studies on our synthetic diffusion data.

artificial intelligence, dimension, machine learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Neural Information Processing SystemsApr-25-2026, 11:02:41 GMT

35a12c43227f217207d4e06ffefe39d3-Paper.pdf

artificial intelligence, dimension, machine learning, (15 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

James McQueen, Marina Meila, Dominique Joncas

Nearly Isometric Embedding by Relaxation

Neural Information Processing SystemsApr-22-2026, 07:02:42 GMT

Many manifold learning algorithms aim to create embeddings with low or no distortion (isometric). If the data has intrinsic dimension d, it is often impossible to obtain an isometric embedding in ddimensions, but possible in s > ddimensions. Yet, most geometry preserving algorithms cannot do the latter. This paper proposes an embedding algorithm to overcome this. The algorithm accepts as input, besides the dimension d, an embedding dimension s d.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: North America > United States > Washington > King County > Seattle (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Martinez, Randy, Tang, Rong, Lin, Lizhen

A Deep Generative Approach to Stratified Learning

arXiv.org Machine LearningApr-14-2026

While the manifold hypothesis is widely adopted in modern machine learning, complex data is often better modeled as stratified spaces -- unions of manifolds (strata) of varying dimensions. Stratified learning is challenging due to varying dimensionality, intersection singularities, and lack of efficient models in learning the underlying distributions. We provide a deep generative approach to stratified learning by developing two generative frameworks for learning distributions on stratified spaces. The first is a sieve maximum likelihood approach realized via a dimension-aware mixture of variational autoencoders. The second is a diffusion-based framework that explores the score field structure of a mixture. We establish the convergence rates for learning both the ambient and intrinsic distributions, which are shown to be dependent on the intrinsic dimensions and smoothness of the underlying strata. Utilizing the geometry of the score field, we also establish consistency for estimating the intrinsic dimension of each stratum and propose an algorithm that consistently estimates both the number of strata and their dimensions. Theoretical results for both frameworks provide fundamental insights into the interplay of the underlying geometry, the ambient noise level, and deep generative models. Extensive simulations and real dataset applications, such as molecular dynamics, demonstrate the effectiveness of our methods.

artificial intelligence, logn, machine learning, (19 more...)

2604.1065

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong > Kowloon (0.04)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Rangamani, Akshay, Unal, Altay

Deep Neural Regression Collapse

arXiv.org Machine LearningMar-26-2026

Neural Collapse is a phenomenon that helps identify sparse and low rank structures in deep classifiers. Recent work has extended the definition of neural collapse to regression problems, albeit only measuring the phenomenon at the last layer. In this paper, we establish that Neural Regression Collapse (NRC) also occurs below the last layer across different types of models. We show that in the collapsed layers of neural regression models, features lie in a subspace that corresponds to the target dimension, the feature covariance aligns with the target covariance, the input subspace of the layer weights aligns with the feature subspace, and the linear prediction error of the features is close to the overall prediction error of the model. In addition to establishing Deep NRC, we also show that models that exhibit Deep NRC learn the intrinsic dimension of low rank targets and explore the necessity of weight decay in inducing Deep NRC. This paper provides a more complete picture of the simple structure learned by deep networks in the context of regression.

artificial intelligence, machine learning, urlhttp, (19 more...)

2603.23805

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)